Mon. Not. R. Astron. Soc. 000 (2012) Printed 29 March 2012 (MN LaTeX style file v2.2)



On the modelling of the excesses of galaxy clusters over high-mass 
thresholds 






J.-C. Waizmann 1,2,3 *, S. Ettori 2,3 and L. Moscardini 1,2,3 

1 Dipartimento di Astronomia, Università di Bologna, via Ranzani 1, 40127 Bologna, Italy
2 INAF - Osservatorio Astronomico di Bologna, via Ranzani 1, 40127 Bologna, Italy
3 INFN, Sezione di Bologna, viale Berti Pichat 6/2, 40127 Bologna, Italy



Accepted 2012 March 2. Received 2012 February 28; in original form 2012 January 17 






ABSTRACT 

In this work we present for the first time an application of the Pareto approach to the mod- 
elling of the excesses of galaxy clusters over high-mass thresholds. The distribution of those 
excesses can be described by the generalized Pareto distribution (GPD), which is closely re- 
lated to the generalized extreme value (GEV) distribution. After introducing the formalism, 
we study the impact of different thresholds and redshift ranges on the distributions, as well 
as the influence of the survey area on the mean excess above a given mass threshold. We 
also show that both the GPD and the GEV approach lead to identical results for rare, thus
high-mass and high-redshift, clusters. As an example, we apply the Pareto approach to
ACT-CL J0102-4915 and SPT-CL J2106-5844 and derive the respective cumulative distribution
functions of the exceedance over different mass thresholds. We also study the possibility of
using the GPD as a cosmological probe. Since in the maximum likelihood estimation of the
distribution parameters all the information from clusters above the mass threshold is used, the 
GPD might offer an interesting alternative to GEV-based methods that use only the maxima in 
patches. When comparing the accuracy with which the parameters can be estimated, it turns 
out that the patch-based modelling of maxima is superior to the Pareto approach. In an ideal
case, the GEV approach is capable of estimating the location parameter with percent-level
precision for fewer than ~100 patches. This result makes the GEV-based approach potentially
interesting also for cluster surveys with a smaller area.

Key words: methods: statistical - galaxies: clusters: general - galaxies: clusters: individual:
ACT-CL J0102-4915 - galaxies: clusters: individual: SPT-CL J2106-5844 - cosmology:
miscellaneous.



1 INTRODUCTION 

Extreme value statistics (EVS), pioneered by the works of Fisher & Tippett (1928) and
Gnedenko (1943), is a branch of statistics that deals with the statistical modelling of
extreme events that substantially deviate from the mean behaviour. The principal
characteristic of EVS is the fact that, for independent, identically distributed (i.i.d.)
random variables, the distribution of the extrema converges to a member of the generalized
extreme value (GEV) distribution.

While EVS is widely used in the environmental and economic sciences, it has not seen many
applications in astrophysics so far. EVS was first applied to the study of the statistics of
the brightest cluster galaxies by Bhavsar & Barrow (1985) and subsequently to the temperature
maxima in the CMB by Coles (1988). It has also been applied to the solar cycle (Asensio Ramos
2007) and to solar radio bursts (Rosa et al. 2010) and, in a cosmological context, to Gaussian
random fields (Colombi et al.).

Recently, mainly triggered by the detection of very massive galaxy clusters at high redshifts
like XMMU J2235.3-2557 at z = 1.4 (Mullis et al. 2005; Rosati et al. 2009; Jee et al. 2009),
ACT-CL J0102 at z = 0.87 (Marriage et al. 2011; Menanteau et al. 2012) and SPT-CL J2106 at
z = 1.132 (Foley et al. 2011; Williamson et al. 2011), the application of EVS to massive
clusters has been studied in several works. Davis et al. (2011) related for the first time the
GEV distribution parameters to cosmological quantities and compared the approach to numerical
N-body simulations. The impact of primordial non-Gaussianity on the EVS of galaxy clusters has
been studied by Chongchitnan & Silk (2012). A direct approach, based on the exact rather than
the asymptotic form, has been utilized by Harrison & Coles (2011, 2012) to study the halo mass
function and the possibility of using extreme clusters to test cosmological models. Waizmann
et al. (2011) proposed to utilise the GEV distribution as a cosmological probe by dividing the
survey area into small, equally sized patches, which allows one to reconstruct the distribution of

Figure 1. Illustrative scheme of the exceedance model. Here $t_1$ and $t_2$ illustrate two
different thresholds in cluster mass and $y_i^1$, $y_i^2$ are the corresponding exceedances
over the respective threshold for an arbitrary cluster with tag $i$.



the most massive haloes in those patches. GEV was also used in Waizmann et al. (2012) to show
that none of the known massive clusters alone is in conflict with $\Lambda$CDM and that there
is no indication that high-z clusters are rarer than low-z ones.

In this work we study the application of an alternative approach in the framework of EVS,
which is based on the statistical modelling of the distribution of excesses (hereafter also
referred to as exceedances) over a high threshold. It has been shown (Pickands 1975; Balkema &
De Hahn 1974) that for high thresholds the distribution of exceedances converges to the
generalized Pareto distribution (GPD), which is closely related to the GEV distribution. The
GPD distribution function allows one to infer the probability that a given observation exceeds
a high threshold by a certain amount; hence we utilise this approach to derive the
distribution of the exceedances in mass of galaxy clusters over a high-mass threshold. Since
surveys based on the Sunyaev-Zeldovich (SZ) effect (Sunyaev & Zeldovich 1972, 1980), like the
South Pole Telescope (SPT) survey (Carlstrom et al. 2011), can be considered to be
mass-limited, we also study whether a GPD-based approach could be utilized as a cosmological
probe.

This paper is structured as follows. In Section 2 we introduce extreme value statistics by
discussing the application of the Gnedenko approach to massive galaxy clusters in Section 2.1.
This is followed by an introduction to the modelling of exceedances with the Pareto approach
and its connection to the Gnedenko approach in Section 2.2. In Section 3 we apply the Pareto
approach to massive high-z clusters in general. This is followed by an example application to
two observed clusters in Section 4, where in Section 4.1 we discuss several observational
effects that have to be taken into account and in Section 4.2 we present the results of this
exercise. After this analysis, we discuss in Section 5 the possible application of exceedance
models to SZ cluster surveys as a cosmological probe and compare it with a GEV-based approach.
Then, we summarise our findings in the conclusions in Section 6.



2 EXTREME VALUE STATISTICS 



Extreme value statistics (for an introduction see e.g. Gumbel 1958; Kotz & Nadarajah 2000;
Coles 2001; Reiss & Thomas 2007) is concerned with the stochastic behaviour of the extremes
(in what follows we consider only maxima) of i.i.d. random variables. In this sense EVS tries
to model the unlikely and to give a quantitative answer to the question of how frequent
unusual observations are. There are two different approaches that are utilized in the
literature and that will be briefly summarized in the following.



2.1 The Gnedenko approach 

The Gnedenko approach deals with the modelling of the block maxima $M_n$ of i.i.d. random
variables $X_i$, which are defined as

$$M_n = \max(X_1, \ldots, X_n). \tag{1}$$



It has been shown (Fisher & Tippett 1928; Gnedenko 1943) that for $n \to \infty$ the limiting
cumulative distribution function (CDF) of the renormalised block maxima is given by one of the
extreme value families: Gumbel (Type I), Fréchet (Type II) or Weibull (Type III). As
independently shown by von Mises (1954) and Jenkinson (1955), these three families can be
unified as a general extreme value distribution (GEV)

$$F_{\rm GEV}(x;\alpha,\beta,\gamma) = {\rm e}^{-q(x)} \tag{2}$$

with

$$q(x) = \begin{cases} \left[1+\gamma\left(\dfrac{x-\alpha}{\beta}\right)\right]^{-1/\gamma} & \text{for } \gamma \neq 0, \\[2ex] {\rm e}^{-(x-\alpha)/\beta} & \text{for } \gamma = 0, \end{cases} \tag{3}$$



with the location, scale and shape parameters $\alpha$, $\beta$ and $\gamma$. In this
generalisation, $\gamma = 0$ corresponds to the Type I, $\gamma > 0$ to the Type II and
$\gamma < 0$ to the Type III distributions. The corresponding probability density function
(PDF) is given by



$$f_{\rm GEV}(x;\alpha,\beta,\gamma) = \frac{{\rm d}F_{\rm GEV}(x;\alpha,\beta,\gamma)}{{\rm d}x}. \tag{4}$$



From now on we will adopt the convention that capital initial letters denote the CDF (like
$F_{\rm GEV}(x;\alpha,\beta,\gamma)$) and small initial letters denote the PDF (like
$f_{\rm GEV}(x;\alpha,\beta,\gamma)$).
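As a rough illustration of equations (2)-(4), the GEV CDF and PDF could be sketched in Python as follows. The function names and the handling of points outside the support are assumptions made for this example; they are not taken from the analysis code behind this work.

```python
import math

def gev_cdf(x, alpha, beta, gamma):
    """GEV CDF of equations (2)-(3) with location alpha, scale beta, shape gamma."""
    z = (x - alpha) / beta
    if gamma == 0.0:
        return math.exp(-math.exp(-z))          # Gumbel (Type I) limit
    s = 1.0 + gamma * z
    if s <= 0.0:
        # outside the support: below the lower end point (gamma > 0)
        # or above the upper end point (gamma < 0)
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-s ** (-1.0 / gamma))

def gev_pdf(x, alpha, beta, gamma):
    """GEV PDF, i.e. the derivative of the CDF with respect to x (equation 4)."""
    z = (x - alpha) / beta
    if gamma == 0.0:
        return math.exp(-z - math.exp(-z)) / beta
    s = 1.0 + gamma * z
    if s <= 0.0:
        return 0.0
    return s ** (-1.0 / gamma - 1.0) * math.exp(-s ** (-1.0 / gamma)) / beta

# Illustrative parameter values (assumed, not results of this paper).
print(gev_cdf(15.05, 15.0, 0.1, -0.2), gev_pdf(15.05, 15.0, 0.1, -0.2))
```

At $x = \alpha$ the CDF equals ${\rm e}^{-1}$ for any shape parameter, which provides a quick sanity check of such an implementation.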

A formalism for the application of GEV to the most massive galaxy clusters has been introduced
by Davis et al. (2011) and is briefly summarized in the following. By introducing the random
variable $u = \log_{10}(m)$, the CDF of the most massive halo reads

$$P_0(u) \equiv {\rm Pr}\{u_{\max} \le u\} = \int_0^{u} p_0(u_{\max})\,{\rm d}u_{\max}. \tag{5}$$



This probability has to be equal to the one of finding no halo with a mass larger than $u$. On
scales ($> 100\,{\rm Mpc}/h$) for which the clustering between galaxy clusters can be
neglected, the CDF is given by the Poisson distribution for the case of zero occurrence (Davis
et al. 2011):

$$P_0(u) = \exp\left[-n_{\rm eff}(>u)\,V\right], \tag{6}$$



where $n_{\rm eff}(>u)$ is the effective comoving number density of haloes above mass
$u = \log_{10}(m)$ obtained by averaging and $V$ is the comoving volume. By assuming that
equation (5) can be modelled by $F_{\rm GEV}(u;\alpha,\beta,\gamma)$, it is possible to relate
the GEV parameters to cosmological quantities by Taylor-expanding both
$F_{\rm GEV}(u;\alpha,\beta,\gamma)$ and $P_0(u)$ around the peaks of the corresponding PDFs.
By comparing the individual first two expansion terms with each other, one finds (Davis et al.
2011)

$$\gamma = n_{\rm eff}(>m_0)V - 1, \qquad \beta = \frac{(1+\gamma)^{(1+\gamma)}}{\left.\dfrac{{\rm d}n_{\rm eff}}{{\rm d}m}\right|_{m_0} V m_0 \ln 10}, \qquad \alpha = \log_{10} m_0 - \frac{\beta}{\gamma}\left[(1+\gamma)^{-\gamma} - 1\right], \tag{7}$$



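where $m_0$ is the most likely maximum mass and ${\rm d}n_{\rm eff}/{\rm d}m|_{m_0}$ is the
effective mass function evaluated at $m_0$, which relates to the effective number density
$n_{\rm eff}(>m)$ via

$$\left.\frac{{\rm d}n_{\rm eff}}{{\rm d}m}\right|_{m_0} = -\left.\frac{{\rm d}n_{\rm eff}(>m)}{{\rm d}m}\right|_{m_0}. \tag{8}$$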



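The most likely mass, $m_0$, can be found (Davis et al. 2011) by performing a root search on

$$\left.\frac{{\rm d}n_{\rm eff}}{{\rm d}m}\right|_{m_0} + m_0\left.\frac{{\rm d}^2 n_{\rm eff}}{{\rm d}m^2}\right|_{m_0} + m_0 V\left(\left.\frac{{\rm d}n_{\rm eff}}{{\rm d}m}\right|_{m_0}\right)^2 = 0. \tag{9}$$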



For calculating $n_{\rm eff}$ we utilized the mass function introduced by Tinker et al. (2008)
and fix the cosmology to $(h, \Omega_{\Lambda 0}, \Omega_{\rm m0}, \sigma_8) =
(0.7, 0.73, 0.27, 0.81)$ based on the Wilkinson Microwave Anisotropy Probe 7-yr (WMAP7)
results (Komatsu et al. 2011).
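The chain from equation (9) to equation (7) can be sketched numerically as below. This is only a toy example: the exponential cumulative number density, its normalisation, the characteristic mass and the volume are hypothetical stand-ins chosen so that everything is analytic, and they do not represent the Tinker et al. (2008) mass function or the cosmology used here.

```python
import math

# Toy (hypothetical) cumulative number density n(>m) = N0 * exp(-m / M_STAR).
# It only stands in for n_eff(>m); all three constants are illustrative.
N0 = 1.0e-5       # normalisation in (Mpc/h)^-3
M_STAR = 1.0e14   # characteristic mass in solar masses
V = 1.0e9         # comoving volume in (Mpc/h)^3

def n_above(m):
    return N0 * math.exp(-m / M_STAR)

def dn_dm(m):
    """Mass function, i.e. minus the derivative of n(>m), as in equation (8)."""
    return N0 / M_STAR * math.exp(-m / M_STAR)

def d2n_dm2(m):
    """Derivative of the mass function with respect to m."""
    return -N0 / M_STAR ** 2 * math.exp(-m / M_STAR)

def peak_condition(m):
    """Left-hand side of equation (9)."""
    return dn_dm(m) + m * d2n_dm2(m) + m * V * dn_dm(m) ** 2

def most_likely_mass(lo, hi, rel_tol=1e-12):
    """Bisection root search on equation (9); assumes a sign change in [lo, hi]."""
    lo_positive = peak_condition(lo) > 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (peak_condition(mid) > 0.0) == lo_positive:
            lo = mid
        else:
            hi = mid
        if hi - lo < rel_tol * hi:
            break
    return 0.5 * (lo + hi)

def gev_parameters(m0):
    """GEV location, scale and shape from equation (7)."""
    gamma = n_above(m0) * V - 1.0
    beta = (1.0 + gamma) ** (1.0 + gamma) / (dn_dm(m0) * V * m0 * math.log(10.0))
    alpha = math.log10(m0) - beta / gamma * ((1.0 + gamma) ** (-gamma) - 1.0)
    return alpha, beta, gamma

m0 = most_likely_mass(M_STAR, 100.0 * M_STAR)
alpha, beta, gamma = gev_parameters(m0)
print(f"m0 = {m0:.3e}  alpha = {alpha:.3f}  beta = {beta:.4f}  gamma = {gamma:.3f}")
```

For this particular toy density the root of equation (9) implies $\gamma = -M_\ast/m_0$, so the shape parameter comes out negative; this is a property of the assumed functional form, not a statement about the real mass function.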

Figure 2. Relation between the CDFs of the GEV (x-axis) and the GPD (y-axis) approach under
the assumption of $t = \alpha$. The dotted line illustrates the 1 : 1 relation and the solid
line depicts the true relation according to equation (15). The vertical lines show the
deviations from the 1 : 1 relation for different values of the CDF based on the GEV approach.



2.2 The Pareto approach 

Exceedance theory, originating mainly from the hydrological literature (see e.g. Fitzgerald
1989; Balkema & De Hahn 1974), has spread into many fields of applied statistics. The basic
notion is that, instead of studying the distribution of the maxima in blocks of random
variables as in equation (1), an alternative view is to consider realisations $X_i$ drawn from
an underlying distribution $F$ as extreme if they exceed a very high threshold $t$, as
depicted in Fig. 1. Thus, one is interested in the conditional probability

$${\rm Pr}\{X > t + y \mid X > t\} = \frac{1 - F(t+y)}{1 - F(t)}, \quad \text{for } y > 0, \tag{10}$$

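where $y$ denotes the exceedance over the threshold $t$. It has been shown (Pickands 1975)
that, for very high thresholds, if the block maxima have an approximate distribution
$F_{\rm GEV}(x;\alpha,\beta,\gamma)$, the distribution of the exceedances can be approximated
by the generalized Pareto distribution (GPD), given by

$$F_{\rm GPD}(y;\tilde\beta,\kappa) = \begin{cases} 1 - \left(1 + \dfrac{\kappa y}{\tilde\beta}\right)^{-1/\kappa} & \text{for } \kappa \neq 0, \\[2ex] 1 - {\rm e}^{-y/\tilde\beta} & \text{for } \kappa = 0, \end{cases} \tag{11}$$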



where the GPD parameters are related to the GEV parameters via

$$\kappa = \gamma, \tag{12}$$
$$\tilde\beta = \beta + \gamma(t - \alpha), \tag{13}$$

and the exceedance $y$ is given by

$$y = x - t \quad \text{if } x > t. \tag{14}$$



Here all the parameters $\gamma$, $\alpha$ and $\beta$ are identical to the GEV parameters
introduced before. For $t = \alpha$ the GPD and the GEV distribution are related by

$$F_{\rm GPD}(y;\tilde\beta,\kappa) = 1 + \ln F_{\rm GEV}(x;\alpha,\beta,\gamma), \tag{15}$$



if $\ln F_{\rm GEV}(x;\alpha,\beta,\gamma) > -1$. In this way, once the GEV parameters from
equation (7) are determined, the GPD is fully determined as well. For small existence
probabilities ($F_{\rm GEV}(x;\alpha,\beta,\gamma)$ close to 1), both distributions will
coincide according to equation (15). This is shown in Fig. 2, from which it can be inferred
that for clusters with an existence probability of less than ~10 per cent the two
distributions differ by less than 1 per cent. For galaxy clusters that are more likely to be
found, the CDFs of both distributions start to deviate significantly from each other, such
that, for a cluster with a ~50 per cent existence probability, the deviation is larger than
25 per cent. It is important to note at this point that both CDFs are correct in the sense
that they just give answers to different statistical questions. The GEV distribution is the
distribution of the most massive cluster to be found in a given cosmic volume, whereas the GPD
is the distribution of the exceedance of all clusters above a high-mass threshold under the
condition that the threshold is exceeded. In the case of rare clusters both distributions give
identical answers, but for less rare clusters they can differ significantly.
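The relation of equation (15) can be checked numerically with a few lines of Python. The sketch below assumes the standard GPD form with scale $\tilde\beta = \beta + \gamma(t - \alpha)$ and uses invented GEV parameter values purely for illustration.

```python
import math

def gev_cdf(x, alpha, beta, gamma):
    """GEV CDF for gamma != 0, evaluated inside its support (equations 2-3)."""
    return math.exp(-(1.0 + gamma * (x - alpha) / beta) ** (-1.0 / gamma))

def gpd_cdf(y, beta_tilde, kappa):
    """GPD CDF for kappa != 0 (equation 11)."""
    return 1.0 - (1.0 + kappa * y / beta_tilde) ** (-1.0 / kappa)

def gpd_scale(alpha, beta, gamma, t):
    """GPD scale parameter from equation (13)."""
    return beta + gamma * (t - alpha)

# Illustrative GEV parameters (assumed values, not results of this paper).
alpha, beta, gamma = 15.0, 0.05, -0.1
t = alpha   # threshold placed at the GEV location parameter

for f_gev in (0.99, 0.9, 0.5):
    # invert the GEV CDF to find the log-mass x with F_GEV(x) = f_gev
    x = alpha + beta / gamma * ((-math.log(f_gev)) ** (-gamma) - 1.0)
    f_gpd = gpd_cdf(x - t, gpd_scale(alpha, beta, gamma, t), gamma)
    print(f"F_GEV = {f_gev:.2f}  F_GPD = {f_gpd:.4f}  difference = {f_gev - f_gpd:.4f}")
```

Because $1 + \ln F \approx F$ only for $F$ close to unity, the printed differences shrink rapidly towards the rare-cluster end.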

Usually, when dealing with data based on an unknown underlying distribution, and if the GPD
parameters have to be determined from the data, an appropriate choice of the threshold becomes
crucial. If it is chosen too low, the limit law of the GPD is violated, resulting in a bias;
if it is chosen too high, data are so sparse that the variance in the parameter estimation is
large. Since we will determine the GPD parameters via the formalism introduced in Sect. 2.1,
we will study thresholds in the vicinity of or above the peak of the PDF inferred from the
GEV. In this way, the limit law of the GPD can be assumed to be valid. This choice of the
threshold is conservative and we do not attempt in this work to estimate the lowest possible
threshold, which is usually done in an empirical way from the data. In principle there are two
ways for the threshold estimation (see e.g. Coles 2001): the first one is based on the notion
that the mean excess should approximately be a linear function of $t$ if the limit law is
fulfilled and the second method is based on the expected stability of the estimated shape
parameter for high enough thresholds. To assess the lowest possible choice of $t$ one would
need to perform the analysis on a numerically simulated light cone, which we intend to do in a
further study.
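The first of the two empirical diagnostics mentioned above, the mean excess as a function of the threshold, could be computed from a catalogue along the following lines. This is a minimal sketch; the list of log-masses is invented for illustration and is not an observed or simulated sample.

```python
def mean_excess(values, t):
    """Empirical mean exceedance over the threshold t (None if nothing exceeds t)."""
    excesses = [v - t for v in values if v > t]
    if not excesses:
        return None
    return sum(excesses) / len(excesses)

# Hypothetical log10 cluster masses, made up purely for this example.
log_masses = [14.62, 14.71, 14.75, 14.83, 14.90, 14.97, 15.05, 15.12, 15.31]

# Tabulate the mean excess for a grid of thresholds; an approximately linear
# trend in t is what one would look for when choosing the threshold.
for t in (14.6, 14.7, 14.8, 14.9, 15.0):
    print(t, mean_excess(log_masses, t))
```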



3 MODELLING EXCEEDANCES 

It is very difficult to test cosmological models based on a single 
survey patch and thus on a single observation of the most massive 

Figure 3. CDFs (upper panels) and PDFs (lower panels) of the exceedance $y$ for a survey area
$A_{\rm s} = 2500\,{\rm deg}^2$ and three different lower redshift limits in the range
$0.8 < z < 1.2$ as indicated in the individual panels. Each black line corresponds to a
different value of the threshold $t$; adjacent values differ by $\Delta t = 0.1$. The outer
values are indicated by the red arrows.

Figure 4. Mean exceedance y in the redshift interval $1.0 < z < 1.5$ as a function of the
threshold $t$ for different survey areas comprising $10\,{\rm deg}^2$, $100\,{\rm deg}^2$,
$2500\,{\rm deg}^2$, $10\,000\,{\rm deg}^2$ and $40\,000\,{\rm deg}^2$, as labelled in the
panel.



cluster in the survey area. One solution to this problem is either to use many patches and to
try to reconstruct the CDF of the most massive systems for a given patch size and redshift
range, or to take a single patch, provided the survey is deep enough, and to apply the
exceedance approach introduced in Sect. 2.2. In doing so, instead of dividing the survey area
into small patches, one uses all the information from objects above a given mass threshold, as
illustrated in Fig. 1. At this point the survey selection function comes into play, which can
cause a serious problem for this approach since the limiting mass is usually an increasing
function of redshift, which makes it difficult to define a threshold above which all clusters
can be detected. An exception to this are surveys based on the SZ effect (Sunyaev & Zeldovich
1972, 1980), since those exhibit an almost constant limiting survey mass (see e.g. Carlstrom
et al. 2011), which would make them in principle ideally suited to an application of the
exceedance theory.



As a case study we decided to consider an SPT-like set-up with $A_{\rm s} = 2500\,{\rm deg}^2$
(Carlstrom et al. 2011) and different lower redshift limits in the range $0.8 < z < 1.2$. We
model thresholds $t = \log_{10}(m_{\rm threshold})$, where $m_{\rm threshold}$ is given in
$M_\odot$, with $14.5 < t < 15.5$ and compute the GPD distributions of the exceedances above a
given threshold from the individual GEV analysis, as discussed in Sect. 2.1. The resulting
CDFs and PDFs of the exceedances are presented in Fig. 3, where the former are shown in the
upper panels and the latter in the lower panels. The start and end values for $t$ are
indicated by the red arrows and the step size used between each black curve is
$\Delta t = 0.1$. Both the CDFs and the PDFs show, as expected, that the higher the threshold,
the less probable high exceedances are. The same holds for increasing the lower redshift
limit, as can be seen by inspecting the panels from left to right.

Apart from the exceedance distributions themselves, it is also possible to calculate the mean
exceedance $E$ and the variance $S^2$, which are simply given by

$$E = \frac{\tilde\beta}{1-\gamma}, \tag{16}$$
$$S^2 = \frac{\tilde\beta^2}{(1-\gamma)^2(1-2\gamma)}. \tag{17}$$



For both relations the choice of the threshold enters via $\tilde\beta$, defined in
equation (13), and the second moment exists only if the condition $\gamma < 1/2$ is fulfilled.
This requirement is met for all cases of interest in this work. The dependence of the mean
exceedance, $E$, on the choice of the threshold is shown in Fig. 4 for five different choices
of the survey area, $A_{\rm s}$, between $10\,{\rm deg}^2$ and $40\,000\,{\rm deg}^2$ and a
redshift range of $1.0 < z < 1.5$, which will be populated by future cluster surveys like
EUCLID (Laureijs et al. 2011), for instance. The red dotted line illustrates the mean
exceedance for the SPT survey area of $2500\,{\rm deg}^2$. As expected, the mean exceedance is
a decreasing function of the threshold, since the larger the threshold is, the smaller the
expected exceedances are. Of course, for a fixed threshold the larger survey area yields
larger exceedances. Like the mean, the variance is also a decreasing function of the
threshold.
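The threshold dependence of equations (16) and (17) enters only through the scale of equation (13), which the short sketch below makes explicit. The GEV parameter values are assumed for illustration and are not those inferred for any of the survey set-ups discussed here.

```python
def gpd_scale(alpha, beta, gamma, t):
    """GPD scale parameter from equation (13)."""
    return beta + gamma * (t - alpha)

def gpd_mean(alpha, beta, gamma, t):
    """Mean exceedance E of equation (16); requires gamma < 1."""
    return gpd_scale(alpha, beta, gamma, t) / (1.0 - gamma)

def gpd_variance(alpha, beta, gamma, t):
    """Variance S^2 of equation (17); requires gamma < 1/2."""
    scale = gpd_scale(alpha, beta, gamma, t)
    return scale ** 2 / ((1.0 - gamma) ** 2 * (1.0 - 2.0 * gamma))

# Illustrative GEV parameters (assumed values); gamma < 0 as for a Type III case.
alpha, beta, gamma = 15.0, 0.05, -0.1

for t in (15.0, 15.1, 15.2, 15.3):
    print(t, gpd_mean(alpha, beta, gamma, t), gpd_variance(alpha, beta, gamma, t))
```

For a negative shape parameter the scale shrinks linearly with increasing threshold, so both moments decrease with $t$ in this example.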



4 AN EXAMPLE APPLICATION: SINGLE CLUSTERS IN 
THE ACT & SPT FIELD 

After having introduced the basic theory in the previous sections, we now present an example
application of the exceedance theory to two single SZ clusters in the Atacama Cosmology
Telescope (ACT) (Fowler et al. 2007) and SPT (Carlstrom et al. 2011) fields. We chose two very
massive objects, namely ACT-CL J0102 and SPT-CL J2106. The former, also dubbed "El Gordo", has
recently been discovered (Marriage et al. 2011) by the ACT in its $755\,{\rm deg}^2$ field.
This merging system is currently the most massive cluster observed at $z > 0.6$ (Menanteau et
al. 2012). By combining SZ, optical, X-ray and infrared data, the mass could be determined to
be $M_{200\rm m} = (2.16 \pm 0.32) \times 10^{15}\,M_\odot$ at a spectroscopic redshift of
$z = 0.87$. Due to the fact that ACT-CL J0102 lies in the overlap region of the ACT and SPT
survey areas, we conservatively assign the combined survey area of $2800\,{\rm deg}^2$ to this
system.
The second object, SPT-CL J2106, has been detected by the SPT collaboration (Foley et al.
2011; Williamson et al. 2011) in a survey area of $2500\,{\rm deg}^2$. With a mass estimate of
$M_{200\rm m} = (1.27 \pm 0.21) \times 10^{15}\,M_\odot$ at a spectroscopic redshift of
$z = 1.132$, this extraordinary system is the most massive cluster at redshifts $z > 1$.



4.1 Preliminary considerations 

Before performing a statistical analysis of single galaxy clusters, 
one has in general to account for two different effects that can sub- 
stantially change the results. 

(i) The correction for the Eddington bias (Eddington 1913): due to the steepness of the mass
function at the high-mass end, it is more likely that lower mass systems scatter up than that
higher mass systems scatter down, resulting in a systematic shift to higher masses. Due to
this effect, clusters appear to be rarer than they actually are.

(ii) The bias discussed in Hotchkiss (2011) that stems from the a posteriori choice of the
redshift interval for the statistical analysis. If the lower redshift boundary is set to the
cluster redshift, one pushes the rareness of a given cluster to the maximum. However, a
cluster of a given mass could easily have shown up at another redshift. If not accounted for,
this bias, like the Eddington bias, makes clusters appear to be rarer than they are.

The strategies for correcting for these effects have already been discussed in Waizmann et al.
(2012); thus we will only briefly summarise them at this point. We correct for the Eddington
bias, following Mortonson et al. (2011), by shifting the observed mass, $M_{\rm obs}$, to a
corrected mass, $M_{\rm corr}$, by

$$\ln M_{\rm corr} = \ln M_{\rm obs} + \frac{1}{2}\epsilon\sigma^2_{\ln M}, \tag{18}$$



where $\epsilon$ is the local slope of the mass function
(${\rm d}n/{\rm d}\ln M \propto M^{\epsilon}$) and $\sigma_{\ln M}$ is the uncertainty in the
mass measurement (for more details see Waizmann et al. 2012). We account for the bias
discussed in Hotchkiss (2011) by a priori choosing the redshift ranges
$z \in [z_{\rm low}, z_{\rm up}]$. In order to compare theory with observations on the same
grounds, it is usually necessary to unify the mass definitions. The observationally reported
masses are frequently defined considering an overdensity computed with respect to the critical
one ($M_{200\rm c}$), whereas in the mass function of Tinker et al. (2008), for instance, the
mean background density ($M_{200\rm m}$) is assumed. For the clusters we discuss in this work,
no correction is necessary because the observed masses are already given in $M_{200\rm m}$.
Therefore, we adopt from Waizmann et al. (2012) for the mass of ACT-CL J0102 a value of
$1.85 \times 10^{15}\,M_\odot$ and for SPT-CL J2106 a value of $1.11 \times 10^{15}\,M_\odot$
that we will use hereafter.
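The Eddington-bias shift of equation (18) amounts to a single multiplicative factor on the observed mass, as the sketch below shows. The slope $\epsilon$ used in the example is a hypothetical value chosen only to show the direction of the correction; it is not the slope used to obtain the corrected masses quoted above, and approximating $\sigma_{\ln M}$ by the fractional mass error is likewise an assumption of this example.

```python
import math

def eddington_corrected_mass(m_obs, epsilon, sigma_ln_m):
    """Corrected mass from equation (18): ln M_corr = ln M_obs + 0.5 * eps * sigma^2."""
    return m_obs * math.exp(0.5 * epsilon * sigma_ln_m ** 2)

m_obs = 2.16e15               # observed mass in solar masses (ACT-CL J0102)
sigma_ln_m = 0.32 / 2.16      # assumed: fractional error used as sigma_lnM
epsilon = -6.0                # hypothetical local slope of the mass function

m_corr = eddington_corrected_mass(m_obs, epsilon, sigma_ln_m)
print(f"M_obs = {m_obs:.3e}  M_corr = {m_corr:.3e}")
```

Since the mass function falls steeply at the high-mass end ($\epsilon < 0$), the corrected mass is always lower than the observed one.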



4.2 Results 

The results of our GPD analysis are shown in Fig. 5, in which we present the CDFs (upper
panels) and PDFs (lower panels) of the exceedance over different thresholds
$t \in \{14.9, 15.0, 15.1\}$ (from left to right) based on a combined survey area of
$2800\,{\rm deg}^2$ and for different lower redshift limits comprising
$z_{\rm low} \in \{0, 0.2, 0.4, 0.6, 0.8\}$. The upper redshift limit is kept fixed at
$z_{\rm up} = 3$, since it has only a weak impact on the distribution functions. The grey
shaded areas denote the uncertainties in the observed masses. As expected, ACT-CL J0102 sits
further in the tail of the distributions than SPT-CL J2106, because its mass is higher and
thus the exceedance is larger. With increasing threshold the clusters move to smaller
exceedances and thus become more likely to be found. As discussed in Hotchkiss (2011), the
distributions are sensitive to the choice of $z_{\rm low}$ in the sense that for smaller
$z_{\rm low}$ the clusters are more likely to be found.

We also calculated the probability to find $y < (m_{\rm obs} - t)$ for a fixed observed mass,
$m_{\rm obs}$, as a function of threshold and present the results in Fig. 6 for ACT-CL J0102
(left panel) and for SPT-CL J2106 (right panel). For the former system, we use
$A_{\rm s} = 2800\,{\rm deg}^2$ and the redshift interval $0.5 < z < 1.0$. For the latter, we
use $A_{\rm s} = 2500\,{\rm deg}^2$ and the redshift interval $1.0 < z < 1.5$. We chose these
specific redshift intervals a priori in order to avoid the previously mentioned bias and to be
directly comparable with the study in Waizmann et al. (2012). The black solid line denotes
$m_{\rm obs} = M_{200\rm m}$ and the black dashed lines denote the upper (lower) allowed mass
limits $m_{\rm up}$ ($m_{\rm low}$). We also added a red arrow together with a red dotted line
in order to denote the probability for the particular choice of the threshold, $t = \alpha$,
and for comparison we added a small black arrow pointing to the probability obtained from a
GEV analysis (Waizmann et al. 2012). From the position of the black arrow with respect to the
red dotted line, one can infer that the GPD delivers lower probabilities of existence than the
GEV analysis. The reason for and nature of this difference have already been discussed in
Section 2.2. As shown in Fig. 2, the two approaches

Figure 5. CDFs (upper panels) and PDFs (lower panels) of the exceedance $y$ for a survey area
$A_{\rm s} = 2800\,{\rm deg}^2$ and for five lower redshift limits in $z \in [0, 0.8]$ (the
upper redshift limit is kept constant at $z = 3$), as indicated in the leftmost panels. The
lines thicken with increasing lower redshift limit and the grey shaded areas show the allowed
mass range due to uncertainties in the mass determination for the two most massive clusters in
the combined ACT and SPT surveys. The red circles denote the different values of the CDFs and
PDFs for the two clusters based on the different lower redshift limits.

give similar results for very rare clusters only. This is also the reason why the difference
between the arrow and the red dotted line is smaller for SPT-CL J2106 in the right-hand panel
compared to ACT-CL J0102 in the left-hand panel. Since the redshift intervals have been chosen
a priori, the former cluster appears to be rarer than the latter. At this point it should be
repeated that both probabilities from the GEV and the GPD approach are correct, since they are
not the answer to the same statistical question.



5 ON THE APPLICABILITY OF THE GPD TO 
PARAMETER ESTIMATION 

Apart from theoretically modelling exceedances for a given cosmological model, the GPD-based
approach could be advantageous for the estimation of the GEV parameters $\alpha$, $\beta$ and
$\gamma$. Instead of only using block maxima (the most massive clusters observed in smaller
patches) as suggested in Waizmann et al. (2011), a GPD approach would use the information from
all observed systems above a given high-mass threshold. This difference could be particularly
important for mass-limited SZ surveys in the sense that information from a larger number of
objects could be used for the parameter estimation. In the following, we will study in more
detail the performance of the GPD approach for parameter estimation and eventually its
usability as a cosmological probe.

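For the estimation of the distribution parameters of the GPD, we utilise the maximum
likelihood estimation (MLE) method. The log-likelihood function for the observation of $n$
excesses over the threshold $t$ reads

$$\ln\mathcal{L} = \begin{cases} -n\ln\tilde\beta - \left(1+\dfrac{1}{\gamma}\right)\displaystyle\sum_{i=1}^{n}\ln\left(1+\dfrac{\gamma y_i}{\tilde\beta}\right) & \text{for } \gamma \neq 0, \\[2ex] -n\ln\tilde\beta - \dfrac{1}{\tilde\beta}\displaystyle\sum_{i=1}^{n} y_i & \text{for } \gamma = 0, \end{cases} \tag{19}$$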

where $\tilde\beta$ from equation (13) contains the parameter dependence on $\alpha$, $\beta$
and $\gamma$. For the best estimates of the GPD parameters, one minimizes $-\ln\mathcal{L}$
for the given set of $y_i$. For the numerical minimization process, we utilized the MINUIT2
library (see footnote 1). In the statistical literature it is very common to consider the GPD
distribution as a 2-parameter distribution of $\gamma$ and $\tilde\beta$ (see e.g. Hüsler et
al. 2011). However, as shown in Waizmann et al. (2011), the location parameter, $\alpha$, is,
among the three distribution parameters, the one with the strongest dependence on the
underlying cosmological model and, thus, it is not desirable to mask this parameter by
combining it with the parameters $\gamma$ and $\beta$ to form the unified parameter
$\tilde\beta$. Therefore, we will focus our study on the 3-parameter case.
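The negative log-likelihood of equation (19) in the 3-parameter form can be written down directly, as sketched below together with an inverse-CDF sampler for mock exceedances. This example uses only the Python standard library and no minimizer; it is not the MINUIT2-based procedure used here, and the parameter values and the random seed are arbitrary assumptions.

```python
import math
import random

def gpd_neg_log_likelihood(alpha, beta, gamma, t, excesses):
    """-ln L of equation (19), with the scale taken from equation (13)."""
    beta_tilde = beta + gamma * (t - alpha)
    if beta_tilde <= 0.0:
        return math.inf
    n = len(excesses)
    if gamma == 0.0:
        return n * math.log(beta_tilde) + sum(excesses) / beta_tilde
    total = 0.0
    for y in excesses:
        s = 1.0 + gamma * y / beta_tilde
        if s <= 0.0:
            return math.inf          # observation outside the GPD support
        total += math.log(s)
    return n * math.log(beta_tilde) + (1.0 + 1.0 / gamma) * total

def sample_gpd(beta_tilde, kappa, n, rng):
    """Draw n exceedances by inverting equation (11) for kappa != 0."""
    return [beta_tilde / kappa * ((1.0 - rng.random()) ** (-kappa) - 1.0)
            for _ in range(n)]

# Illustrative set-up (assumed values, arbitrary seed).
rng = random.Random(1)
alpha, beta, gamma, t = 15.0, 0.05, -0.1, 15.1
beta_tilde = beta + gamma * (t - alpha)
excesses = sample_gpd(beta_tilde, gamma, 5000, rng)

nll_true = gpd_neg_log_likelihood(alpha, beta, gamma, t, excesses)

# A different (alpha, beta) pair that yields the same beta_tilde.
alpha_2 = 14.9
beta_2 = beta_tilde - gamma * (t - alpha_2)
nll_same_scale = gpd_neg_log_likelihood(alpha_2, beta_2, gamma, t, excesses)
print(nll_true, nll_same_scale)
```

As written, equations (13) and (19) involve $\alpha$ and $\beta$ only through $\tilde\beta$, so in this sketch any two $(\alpha, \beta)$ pairs with the same $\tilde\beta$ return the same likelihood value for fixed $\gamma$ and $t$.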
In order to understand whether we can expect an improvement in the parameter estimation with
the GPD or not, we sample observations from the true calculated GPD distribution along the
lines of Section 2.2 and use these samples to estimate the parameters using MLE. The results
of this procedure are shown in the upper panels of Fig. 7 for $\alpha$, $\beta$ and $\gamma$
from left to right. We arbitrarily chose the redshift interval $0.5 < z < 3.0$, a threshold of
$t = 15.0$ and an SPT-like survey area of $A_{\rm s} = 2500\,{\rm deg}^2$. The threshold is
chosen to be high enough to assume the validity of the limit law of the GPD and to mimic the
set-up of the SPT high-mass cluster sample (Foley et al. 2011). In all three panels we show
the relative difference between the MLE-estimated and the true underlying parameters as a
function of the number of observations or, more precisely, the number of clusters above the
threshold. The black lines show the parameter estimates and the orange area denotes the
$3\sigma$ error
1 http://www.cern.ch/minuit 

range. From the two rightmost panels, it can directly be inferred that the scale parameter,
$\beta$, and the shape parameter, $\gamma$, cannot be reliably estimated, even for a
fictitiously large number of observations. For the location parameter, the situation seems to
be much better but, particularly for a small number of observations, the estimate seems to be
biased low and, moreover, the $3\sigma$ error range is quite large. This first result is
sobering considering that at first sight the GPD-based approach seemed to be advantageous due
to the larger number of objects.

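In order to compare the results of the MLE for the GPD with the performance of a pure
GEV-based approach, we repeated the previous statistical experiment with a GEV distribution
for the same redshift range. The log-likelihood function for the GEV case is given by

$$\ln\mathcal{L} = -n\ln\beta - \sum_{i=1}^{n}\left[\left(1+\frac{1}{\gamma}\right)\ln\left(1+\gamma\frac{u_i-\alpha}{\beta}\right) + \left(1+\gamma\frac{u_i-\alpha}{\beta}\right)^{-1/\gamma}\right], \tag{20}$$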

where $n$ is the number of observed clusters and $u_i = \log_{10} M_i$ are the individual
observed masses. In order to mimic the different methodology of dividing the survey area into
smaller patches, we divide the survey area, $A_{\rm s}$, into $n_{\rm p}$ equally sized
patches of area $A_{\rm p}$. We fix $A_{\rm p} = 25\,{\rm deg}^2$ such that in an SPT-like
survey one would observe 100 patches. The results of this procedure are shown in the lower
panels of Fig. 7, again for all three GEV parameters $\alpha$, $\beta$ and $\gamma$. The
difference in the performance of the parameter estimation with respect to the GPD approach is
substantial. The statistical errors are much smaller for $\alpha$ and $\beta$. Particularly
for the location parameter, $\alpha$, percent-level estimation in 100 patches would be
possible in this idealised case. The estimation of the scale and especially of the shape
parameter is less precise but still significantly better than with the GPD approach.
Furthermore, the parameter estimates are unbiased for the idealised GEV case, even for a small
number of observations.
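A stripped-down version of this patch-based experiment can be sketched as follows: mock patch maxima are drawn from a GEV by inverting its CDF and the negative log-likelihood of equation (20) is evaluated at the true and at shifted location parameters. The GEV parameter values and the seed are assumptions for illustration, and no minimizer is run, so this only hints at how sharply the likelihood responds to $\alpha$.

```python
import math
import random

def gev_neg_log_likelihood(alpha, beta, gamma, maxima):
    """-ln L of equation (20) for gamma != 0."""
    if beta <= 0.0:
        return math.inf
    total = len(maxima) * math.log(beta)
    for u in maxima:
        s = 1.0 + gamma * (u - alpha) / beta
        if s <= 0.0:
            return math.inf          # observation outside the GEV support
        total += (1.0 + 1.0 / gamma) * math.log(s) + s ** (-1.0 / gamma)
    return total

def sample_gev(alpha, beta, gamma, n, rng):
    """Draw n block maxima by inverting the GEV CDF (gamma != 0)."""
    draws = []
    for _ in range(n):
        p = rng.random()
        while p == 0.0:              # avoid log(0)
            p = rng.random()
        draws.append(alpha + beta / gamma * ((-math.log(p)) ** (-gamma) - 1.0))
    return draws

survey_area, patch_area = 2500.0, 25.0          # deg^2, as in the text
n_patches = int(survey_area / patch_area)       # 100 patches

# Illustrative GEV parameters for a single patch (assumed values, arbitrary seed).
rng = random.Random(2)
alpha, beta, gamma = 14.6, 0.08, -0.1
maxima = sample_gev(alpha, beta, gamma, n_patches, rng)

for shift in (-0.1, 0.0, 0.1):
    print(shift, gev_neg_log_likelihood(alpha + shift, beta, gamma, maxima))
```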

In order to understand better how the achievable accuracy in the measurement of $\alpha$
compares to deviations from $\Lambda$CDM, we added in Fig. 8 the relative changes in the
parameter $\alpha$ for variations of $\sigma_8 = 0.811$ by $\pm 5$ per cent and of the
equation of state parameter $w = -1$ by $\pm 10$ per cent, in each case keeping the other
cosmological parameters fixed. We also added a quintessence model with an inverse power-law
potential (INV) and a supergravity model (SUGRA), identical to the ones used in Pace et al.
(2010). Based on the results of Waizmann et al. (2011), according to which the GEV-based
approach is particularly sensitive to deviations from the $\Lambda$CDM model at high
redshifts, we choose a redshift range of $1.0 < z < 3.0$. The patch size was again assumed to
be $A_{\rm p} = 25\,{\rm deg}^2$. It can be seen that even for ~100 patches the Gnedenko
approach allows good constraints on $\sigma_8$, which will of course be degenerate with
$\Omega_{\rm m0}$. The constraints on $w$ are less tight and would require the combination
with other cosmological probes to constrain this parameter with a higher precision; for the
INV and SUGRA models 300 - 400 patches would be sufficient to rule them out with this strongly
idealised statistical experiment.

With this small statistical experiment, we could show that the
Pareto approach does not seem to be a favourable way to improve
the GEV parameter estimates. On the contrary, our results confirm
that the patch-based Gnedenko approach is by far superior for the
estimation of the location parameter, α, which is the most
interesting parameter for cosmological applications. Of course,
observational effects and biases will lower the performance in the
estimation of α, yet from a statistical point of view the method based
on the Gnedenko approach seems to be favoured. Even for a small
number of observations, the MLE estimates behave extremely well.
With this solid statistical foundation, the next step will be an
application of the Gnedenko approach to real observables rather than
cluster mass, in order to examine how well the method performs
when applied to real data.

Figure 7. Relative differences between the MLE estimates α_MLE, β_MLE and γ_MLE (in the order of the panels) and the true underlying parameters α_true, β_true
and γ_true as a function of the number of observations for the GPD-based modelling of exceedances over the threshold t = 15.0 (upper panels) and for the
GEV-based modelling of the CDF of the most massive clusters in patches (lower panels). The black lines denote the MLE estimates and the orange area shows
the 3σ errors. The survey area was assumed to be A_s = 2500 deg^2 for the GPD case and the patch size was assumed to be A_p = 25 deg^2 for the GEV case. The
redshift range is in both cases 0.5 < z < 3.0 and in the lower panels the vertical dotted line indicates the number of observations (patches) that corresponds to the
full sky.



6 SUMMARY AND CONCLUSIONS 

In this work we have presented for the first time an application 
of the generalized Pareto distribution to model the exceedances of 
galaxy clusters over a high-mass threshold. The approach to model 
exceedances over high thresholds is very closely linked to the mod- 
elling of extreme values by means of the GEV distribution. Instead 
of calculating the distribution of the block maxima which corre- 
spond to the most massive galaxy cluster in a given cosmic volume, 
one models the distribution of all clusters that are found to be above 
a given high-mass threshold under the condition that this threshold 
is exceeded. We related the underlying cosmological model to the 
three GEV parameters and related those to the two GPD parameters 
that fully describe the distribution. 

We showed that, for a particular choice of the threshold (t = α),
the CDFs of both the GPD and the GEV lead to basically identical
values if the galaxy cluster is very rare (existence probability
< 10 per cent). For clusters that are less rare, both CDFs quickly
deviate substantially from each other. However, it is important to
note that both distributions are correct in the sense that they are
answers to different statistical questions: the GEV distribution is the
distribution of the most massive cluster to be found in a given
cosmic volume, whereas the GPD is the distribution of the exceedance
of all clusters above a high-mass threshold under the condition that
the threshold is exceeded.
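
The agreement of the two CDFs for rare clusters can be checked
numerically with a short sketch. It assumes the standard relation
between the two families, in which the GPD inherits the shape γ of the
GEV and has the scale β + γ(t - α), so that for t = α both CDFs depend
on the same quantity s = [1 + γ(u - α)/β]^(-1/γ), with G = exp(-s) and
H = 1 - s. The function names and parameter values are hypothetical and
serve only as an illustration.

```python
import math


def gev_cdf(u, alpha, beta, gamma):
    """GEV CDF, G(u) = exp(-[1 + gamma (u - alpha) / beta]^(-1/gamma))."""
    z = 1.0 + gamma * (u - alpha) / beta
    if z <= 0.0:  # outside the support: below it for gamma > 0, above for gamma < 0
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-z ** (-1.0 / gamma))


def gpd_cdf(y, t, alpha, beta, gamma):
    """GPD CDF of the exceedance y = u - t, with the assumed scale beta + gamma (t - alpha)."""
    scale = beta + gamma * (t - alpha)
    z = 1.0 + gamma * y / scale
    if z <= 0.0:  # beyond the upper end point (only possible for gamma < 0)
        return 1.0
    return 1.0 - z ** (-1.0 / gamma)


# Hypothetical GEV parameters (illustration only); threshold t = alpha.
alpha, beta, gamma = 14.8, 0.1, -0.1

for u in (14.85, 15.0, 15.2):
    g = gev_cdf(u, alpha, beta, gamma)
    h = gpd_cdf(u - alpha, alpha, alpha, beta, gamma)
    print(f"u = {u:5.2f}  1 - G = {1.0 - g:.4f}  G - H = {g - h:.2e}")
```

Since exp(-s) - (1 - s) is approximately s^2/2 for small s, the
difference between the two CDFs becomes negligible once the existence
probability 1 - G is small, and grows quickly for more common systems.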

Based on the argument that, in contrast to almost all other
types of cluster surveys, SZ surveys exhibit a constant limiting mass
out to high redshifts, we study the application of the GPD for an
SPT-like survey with a survey area of 2500 deg^2. We calculate the
probability distributions of the exceedances for a range of thresholds
and redshift bins. As expected, for a fixed survey area and redshift
range, the PDFs fall off more steeply towards zero the higher the
threshold is chosen. The same applies if the threshold is kept fixed
but the volume of interest is placed at higher redshifts.

With the possibility to analytically derive the distribution of
the threshold excesses, we apply the GPD approach to two SZ
clusters, namely ACT-CL J0102 and SPT-CL J2106, that have been
observed in the combined survey area of ACT and SPT. For our
calculations we assigned to each individual system a priori a
redshift range for which we perform the analysis. This is done in
order to avoid the bias that arises from the a posteriori choice of
the volume, as discussed in Hotchkiss (2011). Neither of the two
clusters is in tension with the ΛCDM cosmology, as discussed in
Waizmann et al. (2012), and the GPD-based approach leads to
the same conclusion. This result is in agreement with the
conclusions drawn in the recent works of Chongchitnan & Silk (2012),
Harrison & Coles (2012) and Menanteau et al. (2012), who find no
tension with ΛCDM for individual clusters. Since SPT-CL J2106,
due to its position in the redshift interval, appears to be rarer than
ACT-CL J0102, the GEV and GPD approaches deliver very similar
probabilities for this system, as discussed above.

Figure 8. Relative per cent differences between α_MLE and the true
underlying parameter α_true as a function of the number of observations for
the GEV-based modelling of the CDF of the most massive clusters in patches. In
addition, we added the expected relative differences for a 5 per cent
increase (decrease) in σ_8 (blue dashed line), a 10 per cent increase (decrease)
in w (blue short-dashed line), the SUGRA (blue dashed-dotted line) and the
INV model (blue dotted line). The black lines denote the MLE estimates
and the orange area shows the 3σ errors. The patch size was assumed to
be A_p = 25 deg^2, which corresponds to a division of the SPT field into 100
patches, and the redshift range is 1.0 < z < 3.0.

So far we have summarized the results for the potential application
to single systems. In the second part of this work, we discussed
whether the GPD might potentially be used as a cosmological probe.
In Waizmann et al. (2011), we proposed to utilise the GEV
distribution as a cosmological probe by dividing the survey area into
equally sized patches and measuring the mass of the most massive
cluster in each patch. In this way it would be possible to reconstruct
the distribution of the maxima and to compare it with the theoretical
expectations. We could show that the position of the peak of the PDF
is the most promising GEV parameter due to its strong dependence
on the underlying cosmology.

For a survey like the SPT one, one would naively expect the
determination of the distribution parameters via maximum likelihood
methods to be superior in the GPD case with respect to the GEV one,
since more clusters are expected to be found above a threshold than
block maxima obtained by dividing the survey into smaller patches. The
increased amount of information should in principle reduce the variance
in the maximum likelihood estimates of the parameters and therefore
result in tighter constraints on deviations from the ΛCDM model. In
order to test this naive assumption, we performed a sampling experiment
for which we created random realisations of the GPD and the GEV
distributions and calculated the MLE estimates for different sample
sizes. We found that, in both approaches, the location parameter α,
which is tightly related to the most likely maximum mass that should be
found in a given volume, can be estimated with higher precision than
the two other GEV parameters β and γ. However, in the GPD approach a
much larger number of realisations (clusters) is needed than in the
patch-based GEV approach. For the latter, ~100 patches are already
sufficient to reach percent-level accuracy on α. From this point of
view, it seems that the GEV-based approach is far superior to the
GPD-based one. The remaining challenge, however, will be to get under
control the observational biases stemming from uncertainties in the
mass measurements and the resulting confusion of less massive clusters
with the most massive one.
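
The logic of the sampling experiment for the patch-based case can be
sketched in a strongly simplified form. Unlike our actual experiment,
which estimates all three parameters, the sketch below keeps β and γ
fixed at their true values and fits only the location parameter α, so
that a one-dimensional search suffices. All names and numerical values
are hypothetical and serve only as an illustration.

```python
import math
import random
import statistics

# Hypothetical "true" GEV parameters (illustration only, gamma < 0 assumed).
ALPHA, BETA, GAMMA = 14.8, 0.1, -0.1


def draw_maxima(n, rng):
    """Draw n patch maxima from the GEV by inverse-CDF sampling."""
    maxima = []
    for _ in range(n):
        p = rng.uniform(1e-12, 1.0 - 1e-12)  # stay clear of log(0) and log(1)
        maxima.append(ALPHA + BETA / GAMMA * ((-math.log(p)) ** (-GAMMA) - 1.0))
    return maxima


def loglike_alpha(alpha, u):
    """GEV log-likelihood as a function of alpha only (constant terms dropped)."""
    total = 0.0
    for ui in u:
        z = 1.0 + GAMMA * (ui - alpha) / BETA
        if z <= 0.0:
            return -math.inf
        total -= (1.0 + 1.0 / GAMMA) * math.log(z) + z ** (-1.0 / GAMMA)
    return total


def fit_alpha(u, iters=40):
    """Ternary search for the alpha that maximises the log-likelihood.

    For -1 < gamma < 0 the log-likelihood is concave in alpha, and the
    support requires alpha > max(u) + beta / gamma, which sets the bracket.
    """
    lo = max(u) + BETA / GAMMA + 1e-9
    hi = max(u) + 10.0 * BETA
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if loglike_alpha(m1, u) < loglike_alpha(m2, u):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def alpha_scatter(n_patches, n_real, rng):
    """Mean and standard deviation of the fitted alpha over n_real realisations."""
    fits = [fit_alpha(draw_maxima(n_patches, rng)) for _ in range(n_real)]
    return statistics.mean(fits), statistics.stdev(fits)


if __name__ == "__main__":
    rng = random.Random(7)
    for n_patches in (25, 100, 400):
        mean, scatter = alpha_scatter(n_patches, 30, rng)
        print(n_patches, (mean - ALPHA) / ALPHA, scatter / ALPHA)
```

Because β and γ are held fixed, the scatter obtained in this way is
smaller than in a full three-parameter fit and should not be compared
quantitatively with the error ranges shown in Fig. 7.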

Thus, the main conclusions of this work can be summarized 
as follows. 

(i) The excess of galaxy clusters over high-mass thresholds can be
analytically modelled with the generalized Pareto distribution.

(ii) For rare clusters, the GPD-based and the GEV-based modelling
lead to identical existence probabilities for extreme galaxy clusters.

(iii) Modelling of exceedances by means of the GPD approach 
seems to be disfavoured as a cosmological probe when compared 
to the patch-based GEV approach. 

(iv) Utilising an MLE approach, the location parameter, α, can
be estimated under idealised circumstances at the percent level with
as few as ~100 patches.

The last point indicates that, from a statistical point of view, the
patch-based method can be easily applied to relatively small
survey areas, particularly if the focus lies on high-z systems. The GPD
approach, however, seems to be usefully applicable only to studies
of single objects and not as a cosmological probe. In order to
observe the large number of clusters required to get an accurate
estimate of α, the threshold would have to be substantially lowered,
which would violate the assumption on which the GPD is based.
In addition, very large survey areas would be required, which
makes the GEV-based approach more appealing for a real
application of the method.